Conference Proceedings

Effective Construction of Relative Lempel-Ziv Dictionaries

K Liao, M Petri, A Moffat, A Wirth

Proceedings of the 25th International Conference on World Wide Web | International World Wide Web Conferences Steering Committee | Published: 2016

Abstract

Web crawls generate vast quantities of text, retained and archived by the search services that initiate them. Efficient and effective compression techniques are critical for storing such data at minimal cost while still providing some level of random access to the compressed data. The Relative Lempel-Ziv (RLZ) scheme provides fast decompression and retrieval of documents from within large compressed collections and, even with a relatively small RAM-resident dictionary, is competitive with adaptive compression schemes. To date, the dictionaries required by RLZ compression have been formed from concatenations of substrings regularly sampled from the underlying ..
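
As an illustration only, the regular-sampling dictionary and RLZ factorization mentioned in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the factor representation, and the naive longest-match search (a real system would typically use a suffix array or similar index over the dictionary) are all hypothetical choices made for readability.

```python
from typing import List, Tuple, Union

# Hypothetical factor representation: ("copy", position, length) or ("lit", char).
Factor = Union[Tuple[str, int, int], Tuple[str, str]]


def build_sampled_dictionary(text: str, sample_len: int, num_samples: int) -> str:
    """Concatenate substrings of length sample_len taken at regular intervals."""
    if not text or sample_len <= 0 or num_samples <= 0:
        return ""
    stride = max(1, len(text) // num_samples)
    starts = list(range(0, len(text), stride))[:num_samples]
    return "".join(text[s:s + sample_len] for s in starts)


def longest_match(doc: str, i: int, dictionary: str) -> Tuple[int, int]:
    """Return (position, length) of the longest prefix of doc[i:] found in dictionary."""
    best_pos, best_len = -1, 0
    while i + best_len < len(doc):
        # Naive search; assumed here only to keep the sketch short.
        pos = dictionary.find(doc[i:i + best_len + 1])
        if pos == -1:
            break
        best_pos, best_len = pos, best_len + 1
    return best_pos, best_len


def rlz_factorize(doc: str, dictionary: str) -> List[Factor]:
    """Greedily encode doc as copies from the dictionary, with literals as fallback."""
    factors: List[Factor] = []
    i = 0
    while i < len(doc):
        pos, length = longest_match(doc, i, dictionary)
        if length == 0:
            factors.append(("lit", doc[i]))  # character absent from the dictionary
            i += 1
        else:
            factors.append(("copy", pos, length))
            i += length
    return factors


def rlz_decode(factors: List[Factor], dictionary: str) -> str:
    """Rebuild a document from its factors using only the dictionary."""
    parts = []
    for f in factors:
        if f[0] == "copy":
            parts.append(dictionary[f[1]:f[1] + f[2]])
        else:
            parts.append(f[1])
    return "".join(parts)
```

Because each document is encoded only against the fixed dictionary, any single document can be decoded on its own with `rlz_decode`, which is the property that gives RLZ its fast random access.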

Grants

Awarded by Australian Research Council's Discovery Project scheme

Awarded by Victorian Life Sciences Computation Initiative on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian State Government, Australia

Funding Acknowledgements

This work was funded by the Australian Research Council's Discovery Project scheme (DP140103256), and by the Victorian Life Sciences Computation Initiative (VR0280), on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian State Government, Australia.